Thirty Days of Metal — Day 22: Instancing
This series of posts is my attempt to present the Metal graphics programming framework in small, bite-sized chunks for Swift app developers who haven’t done GPU programming before.
If you want to work through this series in order, start here. To download the sample code for this article, go here.
So far, there has been a one-to-one relationship between nodes and draw calls: we encode one draw call per mesh (or really, submesh). Suppose we want to render the same mesh many times. We could certainly write a loop that encodes one draw call for each time we want to render the mesh. However, encoding isn’t a cheap operation, so reducing draw call count is desirable.
It turns out Metal supports a feature that allows us to draw many meshes with a single draw call: instancing. With instancing, we tell Metal how many times to draw a mesh, while providing per-instance data in a buffer.
This per-instance data is just like the per-node constant data we have used in our shaders previously. The only thing that changes is how we access it.
In the following sections, we will see how to organize batches of nodes so they can be rendered with instanced draw calls, increasing efficiency while allowing us to vary the animated transformations of each instance independently.
The Sample Scene
We will build an asteroid field as our sample scene to demonstrate instancing. The background will be a skybox textured with distant stars, while the foreground will consist of about 100 gently tumbling asteroids. Each asteroid uses one of three meshes.
Each asteroid has its own position, scale, and rotation parameters. Asteroids are batched together so that only one instanced draw call per mesh is required.
Loading Mesh Assets
Our asteroid mesh assets are stored in three files: asteroid1.obj, asteroid2.obj, and asteroid3.obj. This naming scheme allows us to iterate over the file names using string interpolation:
for asteroidID in 1...3 {
    let assetURL = Bundle.main.url(
        forResource: "asteroid\(asteroidID)",
        withExtension: "obj")
    let mdlAsset = MDLAsset(
        url: assetURL,
        vertexDescriptor: mdlVertexDescriptor,
        bufferAllocator: bufferAllocator)
The rest of the code for loading the asset’s mesh and converting it to an MTKMesh is unchanged.
Creating Node Batches
For each asteroid mesh, we will create many asteroid nodes that are grouped into a batch. A batch is a set of nodes that share a mesh and texture while having varying transforms.
We will store each asteroid’s state separate from its node, in an AsteroidState object. This allows us to define custom properties such as the asteroid’s rotation axis and angular velocity (rotation rate). We then derive the corresponding node’s transform from these properties.
class AsteroidState {
    var position = SIMD3<Float>(0, 0, 0)
    var scale: Float = 1.0
    var rotationAxis = SIMD3<Float>(0, 1, 0)
    var rotationAngle: Float = 0.0
    var angularVelocity: Float = 0.0
}
We define a few properties on our renderer to hold our batches, our asteroid nodes, and our asteroid state objects:
var batches = [[Node]]()
var asteroidNodes = [Node]()
var asteroids = [AsteroidState]()
For each mesh, we want to create a batch of asteroids, each of which has unique, randomly generated properties. Specifically, we generate a random position, a random rotation axis, a random rotation rate, and a random scale.
let asteroidsPerMesh = 50
var batch = [Node]()
for _ in 0..<asteroidsPerMesh {
    let asteroid = AsteroidState()
    asteroid.position = SIMD3<Float>(Float.random(in: -10...10),
                                     Float.random(in: -10...10),
                                     Float.random(in: -20...0))
    asteroid.rotationAxis =
        normalize(SIMD3<Float>(Float.random(in: -1...1),
                               Float.random(in: -1...1),
                               Float.random(in: -1...1)))
    asteroid.rotationAngle = 0.0
    asteroid.angularVelocity = Float.random(in: -0.5...0.5)
    asteroid.scale = Float.random(in: 0.6...1.2)
    asteroids.append(asteroid)
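A side note on the random axis: normalizing a vector sampled from a cube slightly favors directions that point toward the cube's corners, and it would produce NaNs in the extremely unlikely event that all three components come out as zero. For tumbling asteroids the difference is cosmetic, but if a uniform spread of axes matters, rejection sampling is one option. The following is a minimal sketch using only the Swift standard library; `randomUnitVector(using:)` is a hypothetical helper, not part of the sample code.

```swift
/// Returns a unit-length vector whose direction is uniformly distributed.
/// Hypothetical helper for illustration; not part of the article's sample code.
func randomUnitVector<G: RandomNumberGenerator>(using generator: inout G) -> SIMD3<Float> {
    while true {
        // Sample a point inside the cube [-1, 1]^3.
        let v = SIMD3<Float>(Float.random(in: -1...1, using: &generator),
                             Float.random(in: -1...1, using: &generator),
                             Float.random(in: -1...1, using: &generator))
        let lengthSquared = (v * v).sum()
        // Keep only points inside the unit sphere (removes the corner bias)
        // and away from the origin (avoids dividing by a near-zero length).
        if lengthSquared > 1e-6 && lengthSquared <= 1 {
            return v / lengthSquared.squareRoot()
        }
    }
}

var rng = SystemRandomNumberGenerator()
let axis = randomUnitVector(using: &rng)
print(axis)
```

The result could be assigned to asteroid.rotationAxis in place of the normalize(...) expression.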
At the same time, we create a node for each asteroid and add it to the current batch, as well as to the list of asteroid nodes, which we use later when animating the asteroids.
    let node = Node(mesh: mesh)
    node.texture = texture
    asteroidNodes.append(node)
    batch.append(node)
}
Once we are done generating a batch of asteroids, we add it to our collection of batches, which we will iterate over when rendering.
batches.append(batch)
Updating Per-Instance Constants
Updating the per-instance data is similar to how we have previously updated per-node constants. First, we animate the asteroids according to their individual properties. Then, we copy the updated transforms into the constant buffer.
Updating the node state consists of incrementally rotating each asteroid around its randomized axis of rotation according to its rotational velocity, then building its transformation from its translation, scale, and rotation properties:
for (asteroidIndex, asteroidState) in asteroids.enumerated() {
    asteroidState.rotationAngle +=
        asteroidState.angularVelocity * timestep
    let S = simd_float4x4(scale:
        SIMD3<Float>(repeating: asteroidState.scale))
    let R = simd_float4x4(rotateAbout: asteroidState.rotationAxis,
                          byAngle: asteroidState.rotationAngle)
    let T = simd_float4x4(translate: asteroidState.position)
    asteroidNodes[asteroidIndex].transform = T * R * S
}
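The order of the product matters here: with column vectors, T * R * S applies the scale first, then the rotation, then the translation. The sketch below illustrates the effect with just a scale and a translation. It uses built-in simd initializers rather than the convenience initializers seen above (scale:, rotateAbout:byAngle:, translate:), and the specific matrices and values are assumptions chosen for illustration.

```swift
import simd

// Uniform scale by 2, built with simd's diagonal initializer.
let scale = simd_float4x4(diagonal: SIMD4<Float>(2, 2, 2, 1))

// Translation by (5, 0, 0): the offset lives in the fourth column.
var translation = matrix_identity_float4x4
translation.columns.3 = SIMD4<Float>(5, 0, 0, 1)

let point = SIMD4<Float>(1, 0, 0, 1)

// Scale first, then translate: (1, 0, 0) -> (2, 0, 0) -> (7, 0, 0).
let scaledThenMoved = (translation * scale) * point

// Translate first, then scale: (1, 0, 0) -> (6, 0, 0) -> (12, 0, 0).
let movedThenScaled = (scale * translation) * point

print(scaledThenMoved, movedThenScaled)
```

Keeping the scale on the right of the product means an asteroid's size does not change where it ends up in the field.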
We need to store a constant buffer offset for each batch. Before allocating constant storage for the batches, we first clear the offsets from the previous frame.
batchConstantsOffsets.removeAll()
Then, we iterate over the batches, copying the model-to-world transformations of each of the batch’s nodes into the constant buffer.
for batch in batches {
    let layout = MemoryLayout<InstanceConstants>.self
    let offset = allocateConstantStorage(
        size: layout.stride * batch.count,
        alignment: layout.stride)
    let instanceConstants = constantBuffer.contents()
        .advanced(by: offset)
        .bindMemory(to: InstanceConstants.self,
                    capacity: batch.count)
    for (nodeIndex, node) in batch.enumerated() {
        instanceConstants[nodeIndex] =
            InstanceConstants(modelMatrix: node.worldTransform)
    }
    batchConstantsOffsets.append(offset)
}
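The allocateConstantStorage(size:alignment:) method is used above without being shown, presumably because it carries over from earlier in the series. To make the offset arithmetic concrete, here is a hypothetical sketch of an aligned bump allocator of that general kind. The type name, its properties, and the capacity are assumptions for illustration, not the sample's actual implementation.

```swift
/// Hands out offsets into a fixed-size region, rounding each one up to the
/// requested alignment. Hypothetical stand-in for the renderer's allocator.
struct ConstantAllocator {
    let capacity: Int
    private(set) var nextOffset = 0

    init(capacity: Int) {
        self.capacity = capacity
    }

    mutating func allocate(size: Int, alignment: Int) -> Int {
        precondition(alignment > 0, "alignment must be positive")
        // Round the current offset up to the next multiple of `alignment`.
        let aligned = (nextOffset + alignment - 1) / alignment * alignment
        precondition(aligned + size <= capacity, "constant buffer is full")
        nextOffset = aligned + size
        return aligned
    }
}

// Assumed 64-byte stride: a struct holding one 4x4 float matrix.
let stride = 64
var allocator = ConstantAllocator(capacity: 64 * 1024)
_ = allocator.allocate(size: 10, alignment: 4)                 // some earlier data
let batchA = allocator.allocate(size: stride * 50, alignment: stride)
let batchB = allocator.allocate(size: stride * 50, alignment: stride)
print(batchA, batchB)
```

Because each batch's size is a multiple of the stride and its offset is aligned to the stride, the batches occupy contiguous, non-overlapping regions that can be indexed as arrays of InstanceConstants.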
Instancing in the Vertex Function
We will use the same type of constant data for each instance that we used for each node previously: the model-to-world transformation matrix. To emphasize that this data will be fetched per-instance, we name the structure InstanceConstants:
struct InstanceConstants {
    float4x4 modelMatrix;
};
Now, rather than taking a reference to a single NodeConstants structure, we take a pointer to an array of InstanceConstants structures, along with the built-in instance ID:
vertex VertexOut vertex_main(
    VertexIn in [[stage_in]],
    constant InstanceConstants *instances [[buffer(2)]],
    constant FrameConstants &frame [[buffer(3)]],
    uint instanceID [[instance_id]])
{
Taking a pointer rather than a reference allows us to index into the instance constants buffer to retrieve the constants for a single instance:
    constant InstanceConstants &instance = instances[instanceID];
We can then use the instance’s model matrix just as we have been using the node’s model matrix:
    float4x4 modelViewMatrix = frame.viewMatrix * instance.modelMatrix;
The remainder of the vertex function is unchanged.
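To build intuition for what the instance ID does, it can help to emulate an instanced draw on the CPU. The sketch below is a conceptual model only: a real GPU runs these invocations in parallel rather than in nested loops, and for brevity the per-instance data here is a translation vector standing in for the full model matrix. All names in it are hypothetical Swift stand-ins for the shader-side concepts.

```swift
// Per-instance data; a translation stands in for the full model matrix.
struct InstanceConstants {
    var translation: SIMD3<Float>
}

// Plays the role of the vertex function: it receives the whole instance
// array and uses the instance ID to pick out its own entry.
func vertexMain(position: SIMD3<Float>,
                instances: [InstanceConstants],
                instanceID: Int) -> SIMD3<Float> {
    let instance = instances[instanceID]
    return position + instance.translation
}

// Plays the role of one instanced draw call: every vertex of the mesh is
// processed once per instance, with the matching instance ID.
func drawInstanced(vertices: [SIMD3<Float>],
                   instances: [InstanceConstants],
                   instanceCount: Int) -> [SIMD3<Float>] {
    var output = [SIMD3<Float>]()
    for instanceID in 0..<instanceCount {
        for vertex in vertices {
            output.append(vertexMain(position: vertex,
                                     instances: instances,
                                     instanceID: instanceID))
        }
    }
    return output
}

let triangle = [SIMD3<Float>(0, 0, 0), SIMD3<Float>(1, 0, 0), SIMD3<Float>(0, 1, 0)]
let instances = [InstanceConstants(translation: SIMD3<Float>(0, 0, 0)),
                 InstanceConstants(translation: SIMD3<Float>(10, 0, 0))]
let transformed = drawInstanced(vertices: triangle,
                                instances: instances,
                                instanceCount: instances.count)
print(transformed.count)
```

One call processes the three vertices twice, once per instance, which is the same relationship the instanced draw call sets up between the mesh and the instance constants buffer.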
Instanced Draw Calls
The heart of our instanced drawing routine is the drawIndexedPrimitives(type:indexCount:indexType:indexBuffer:indexBufferOffset:instanceCount:) method. It tells the encoder to encode an instanced, indexed draw call, which will draw instanceCount instances of whatever mesh is being rendered.
To draw each batch of asteroids, we iterate over our array of batch arrays. For each batch, we select the first node and mesh as “representative,” and use them to bind the resources that will be used to draw the mesh.
for (batchIndex, batch) in batches.enumerated() {
    guard let node = batch.first else { continue }
    guard let mesh = node.mesh else { continue }
We also need to bind the instance constants, which we do by using the previously calculated per-batch constants offsets.
    renderCommandEncoder.setVertexBuffer(
        constantBuffer,
        offset: batchConstantsOffsets[batchIndex],
        index: 2)
The rest of our MTKMesh rendering code is unchanged, apart from the addition of the instanceCount parameter:
    renderCommandEncoder.drawIndexedPrimitives(
        type: submesh.primitiveType,
        indexCount: submesh.indexCount,
        indexType: submesh.indexType,
        indexBuffer: indexBuffer.buffer,
        indexBufferOffset: indexBuffer.offset,
        instanceCount: batch.count)
Running the sample app shows our many, diverse asteroids a-tumbling in the depths of space.
Now that we have a dense, perilous field of asteroids, it would be a shame not to be able to fly through them. In the next article, we will introduce interactivity and camera controllers.